Language, Social, and Face Regions Are Affected in Toddlers with Autism and Predictive of Language Outcome

Identifying prognostic early brain alterations is crucial for autism spectrum disorder (ASD). Leveraging structural MRI data from 166 ASD and 109 typically developing (TD) toddlers and controlling for brain size, we found that, compared to TD, ASD toddlers showed larger or thicker lateral temporal regions; smaller or thinner frontal lobe and midline structures; larger callosal subregion volume; and smaller cerebellum. Most of these differences were replicated in an independent cohort of 38 ASD and 37 TD toddlers. Moreover, the identified brain alterations were related to ASD symptom severity and cognitive impairments at intake, and, remarkably, they improved the accuracy for predicting later language outcome beyond intake clinical and demographic variables. In summary, brain regions involved in language, social, and face processing were altered in ASD toddlers. These early-age brain alterations may be the result of dysregulation in multiple neural processes and stages and are promising prognostic biomarkers for future language ability.


Introduction
ASD is a neurodevelopmental disorder characterized by social and communicative deficits and repetitive behaviors emerging at ages 1 to 4 years 1,2 . ASD affects approximately 1 in 44 children in the United States 3 . The high prevalence rate of ASD and the associated social and language deficits significantly elevate the risk of adverse outcomes for individuals with ASD and increase the burden on the involved families and society as a whole. Clinical heterogeneity of ASD is considerable 4,5,6,7,8 : Some toddlers benefit from contemporary applied behavior analysis treatments, but others do not. Some toddlers may earn a college degree and live independently, but others remain minimally verbal with life-long struggles with social communication. While language and social symptoms improve with age in some toddlers, they do not in others, and such outcome differences are not clearly predictable from clinical scores at very early ages 4,5,6,7,8 . Characterizing ASD neuropathology at the age of clinical onset, and how it relates to clinical heterogeneity, is essential for aiding early diagnosis, prognosis, and early interventions.
Converging evidence from neuroanatomical studies suggests brain overgrowth in young children with ASD 9,10,11,12,13,14,15 , especially in frontal and temporal regions 2,14,15,16,17 , while other brain regions show inconsistent alteration patterns in ASD. For example, both volume increases and reductions have been reported in the amygdala 18,19,20 , corpus callosum 21,22,23,24 , and cerebellum 25,26,27,28 . The inconsistent results may be due to differences in cohort (e.g., subject characteristics), MRI scanner, preprocessing pipeline, and analytical methodology 20,29 . Moreover, most studies focused on global measures or single regions (e.g., amygdala, cerebellum, and corpus callosum) and single morphometric measures (e.g., volume, surface area, cortical thickness) of interest that may be relevant to ASD. In the cortex, surface area (SA) and cortical thickness are dissociable features 30 ; examining potential alterations in both features in the same sample may point to distinct biological origins of cortical gray matter changes.
No study of brain alterations in young children with ASD has yet examined regional differences across the brain and examined volume, cortical thickness, and SA in a comprehensive manner.
Brain size alterations have been widely reported to underlie language and social deficits and facial recognition impairment in ASD. For example, volumes in frontal and temporal regions were related to repetitive behavior and social and communication deficits in ASD as revealed in an unbiased voxel-based morphometry study 25 or a source-based morphometry (a multivariate approach) study 31 . Moreover, Dziobek and colleagues identified that increased cortical thickness in the fusiform gyrus was associated with more severe face processing impairments in individuals with autism 32 . These studies used a cross-sectional design and examined an older sample among whom compensatory neural alterations may have resulted from behavioral challenges rather than caused them.
There are heterogeneous developmental courses in ASD; some ASD toddlers get better, and others get worse with age 33,34,35 . Our previous work demonstrated that the degree of functional hypoactivation of ASD toddlers in the temporal region in response to a language task markedly improved the accuracy for classifying language outcome when combined with behavioral and clinical variables 33 . However, it is not clear whether structural alterations of subcortical and cortical regional size identified at the earliest clinic visit can contribute to discriminating different prognosis trajectories.
To shed light on this, we first examined complete and replicable regional early brain alterations in a large sample of toddlers (166 ASD, 109 typically developing (TD)). Specifically, we comprehensively and systematically investigated differential regional brain volume and cortical SA and thickness measurements in ASD compared to TD toddlers, while controlling for global brain size. We then examined the replicability of discovered regional differences in an independent toddler cohort (38 ASD, 37 TD) using the same preprocessing pipeline and the same statistical methods. We further investigated whether these brain alterations were associated with contemporaneous behavioral manifestations of ASD quantified by symptom severity assessed using the Autism Diagnostic Observation Schedule (ADOS), and cognitive and behavioral performance evaluated using Mullen and Vineland. Lastly, we investigated whether including regional size measures found to be altered at intake age would improve a model's ability to predict language outcome at 3-4 years of age beyond intake clinical and behavioral measures.

Results
ASD vs. TD brain structure difference in main sample
Early brain overgrowth in ASD is one of the most widely reported findings on ASD brain structural development. Hence, we started by examining ASD vs. TD differences in global brain measures (i.e., the estimated total intracranial volume (eTIV), total cortical SA, and mean cortical thickness) in our samples using linear mixed effect models (LMEMs, see details in Methods) while adjusting for the effects of age and sex (fixed effects) and longitudinal scans (random effect). In the main sample (N = 275, see Table 1 for demographic and clinical characteristics at the time of initial scanning), no significant ASD vs. TD difference was observed for eTIV (p = 0.96), total cortical volume (p = 0.07), total cortical SA (p = 0.49), or mean cortical thickness (p = 0.47). However, toddlers with poor language outcome (ASD Poor, Mullen expressive and receptive language T score < 40 at 3-4 years of age) presented significantly greater total cortical volume compared to TD toddlers (p = 2.56 × 10⁻³, Cohen's d (referred to as d hereafter) = 0.39, beta = 8.62).
Using LMEMs (see details in Methods) while adjusting for the effects of age, sex, and global brain measurements (fixed effects) and longitudinal scans (random effect), we found that four cortical regions had significant volume differences between ASD and TD toddlers after FDR correction at p < 0.05 (Fig. 1

Note: values for age and all clinical test scores are presented as mean (SD). SD represents standard deviation. ADOS SA represents ADOS social affect, and ADOS RRB represents ADOS restricted and repetitive behavior.
Moreover, six cortical regions showed a significant thickness difference between ASD and TD toddlers (Fig. 1 upper right). Compared to TD, ASD toddlers had significantly thicker cortex in LH superior temporal and decreased volume in right cerebellum cortex (p = 1.56 × 10⁻², d = -0.41, beta = -1.82) compared to TD (Fig. 2 right). None of the three cortical regions showing significant SA differences were replicated (p > 0.05). Violin plots of brain regions that were replicated for ASD vs. TD differences are presented in supplemental Fig. S2.
Associations between brain-size adjusted regional measurements and behavior are more pronounced in ASD toddlers and showed region-by-diagnosis interaction effects
Among 13 regions showing significant ASD vs. TD differences in the main sample, four were significantly related to ADOS symptom severity or Mullen subscale scores after FDR correction (Fig. 3 and Fig. S3). In ASD toddlers ( Associations were strongly negative in the ASD group, but near zero or positive in the TD group. Moreover, SA in RH caudal anterior cingulate significantly interacted with diagnosis to predict Mullen ratio RL (p = 1.52 × 10⁻²). Scatter plots of significant brain-behavior associations are presented in Fig. S4.
Identified early regional brain sizes substantially improved the accuracy for predicting later language outcome in ASD toddlers
As previously employed for prognostic analyses 33,34,35 , we stratified language outcome of ASD toddlers as ASD Good or ASD Poor based on Mullen expressive language (EL) and receptive language (RL) T scores at outcome visit. An ASD toddler was grouped as ASD Poor if both Mullen EL and RL T scores were below −1 SD of the T score norm of 50 (i.e., T < 40). An ASD toddler was classified as ASD Good if the toddler had either a Mullen EL or RL T score equal to or greater than −1 SD of the normative T score of 50 (i.e., T ≥ 40). Out of 166 ASD toddlers, 157 had a Mullen evaluation at outcome visit and were stratified into two outcome groups: ASD Good (N = 69; 59 males, 10 females; age = 33.88 ± 4.44 months) and ASD Poor (N = 88; 71 males, 17 females; age = 34.55 ± 5.18 months). These 157 ASD toddlers were further used for language outcome prediction analysis. Their Mullen EL and RL T scores at outcome visit are displayed in Fig. S5, where ASD Good toddlers showed similar language outcome to TD toddlers.
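The stratification rule above can be written as a small function. This is a minimal sketch under stated assumptions: the function name and argument names are hypothetical and not from the study's code; only the T < 40 cutoff and the "both below" vs. "either at or above" logic come from the text.

```python
def classify_language_outcome(el_t, rl_t, cutoff=40):
    """Label an ASD toddler's language outcome from Mullen T scores.

    el_t, rl_t: Mullen expressive / receptive language T scores at the
    outcome visit (hypothetical argument names).
    cutoff: -1 SD below the normative T score of 50, i.e. 40.
    """
    # ASD Poor requires BOTH subscale T scores to fall below the cutoff.
    if el_t < cutoff and rl_t < cutoff:
        return "ASD Poor"
    # Otherwise at least one T score is >= cutoff, which the text labels ASD Good.
    return "ASD Good"
```

Note that a toddler with exactly T = 40 on either subscale falls into ASD Good, because the Poor criterion uses a strict inequality.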
We employed support vector machine (SVM) with ridge regularization to classify language outcome (ASD Good/Poor  Figure 4 plots the performance of clinical/demographic only, sMRI only, and clinical/demographic + sMRI models for classifying ASD Good vs. Poor language outcome. Sensitivity and specificity reflect the accuracy for correctly detecting ASD Poor and ASD Good, respectively. Combining intake clinical/demographic and sMRI features yielded the highest accuracy (81%) and area under the receiver operating characteristic curve (AUC = 79%) compared to that from a single modality (sMRI only model: accuracy = 69%, AUC = 63%; clinical/demographic only model: accuracy = 72%, AUC = 70%). The clinical/demographic only model achieved slightly higher accuracy than the sMRI only model, especially for detecting ASD Good toddlers (i.e., specificity). The sMRI only model had the highest accuracy in detecting ASD Poor toddlers (i.e., sensitivity). Fig. S6 displays the contribution (weight) of each intake clinical/demographic and sMRI feature to predicting the language outcome of ASD toddlers.
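To make the sensitivity/specificity convention concrete, the sketch below computes the three count-based metrics with ASD Poor as the positive class, as the text defines them. The function name and the toy labels are illustrative assumptions only; they are not study data and do not reproduce the reported percentages.

```python
def outcome_metrics(y_true, y_pred, positive="ASD Poor"):
    """Sensitivity, specificity, and accuracy with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    nan = float("nan")
    return {
        # Fraction of true ASD Poor toddlers predicted as ASD Poor.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Fraction of true ASD Good toddlers predicted as ASD Good.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
    }


# Toy labels for illustration only (not from the study).
toy_true = ["ASD Poor", "ASD Poor", "ASD Poor", "ASD Good", "ASD Good"]
toy_pred = ["ASD Poor", "ASD Poor", "ASD Good", "ASD Good", "ASD Poor"]
toy_metrics = outcome_metrics(toy_true, toy_pred)
```

AUC is omitted here because it requires continuous decision scores rather than hard labels.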

Discussion
In this study, we surveyed the volume, thickness, and surface area of all regions across the brain to observe which size measures were reproducibly altered in ASD toddlers compared to TD toddlers. Identified brain regions are mainly involved in receptive and expressive language, social and face processing (bank STS, middle temporal, superior temporal, medial orbitofrontal, caudal anterior cingulate, posterior cingulate, pars opercularis, caudal middle frontal) 36,37,38,39,40,41,42,43,44,45,46 . Additional regions included those involved in motor, behavioral, cognitive, and language control; primary visual processing; and interhemispheric communication (cerebellum, primary visual cortex, corpus callosum) 47,48,49,50,51,52,53,54 . Morphometrically, we observed alterations in regional volume, thickness, and surface area relative to global measures. Thus, by first factoring out brain size, differentially increased or decreased growth in different anatomic measures in ASD-relevant language, social, face processing and behavior regulation regions was isolated and highlighted. Cortically, frontal lobe and midline structures tended to be smaller or thinner in ASD than TD; lateral temporal regions tended to be larger or thicker in ASD. Outside the cortex, larger callosal subregion volume and smaller cerebellum were observed. The majority of the identified GMV and cortical thickness alterations were replicated in an independent cohort. Importantly, larger (i.e., more aberrant) GMV in LH fusiform, LH and RH middle temporal were related to more severe ADOS social symptoms and/or poorer Mullen cognitive (ELC, ratio RL, and ratio VR) performance in ASD toddlers. These relationships were significantly stronger in the ASD compared to the TD group. Of clinical relevance, the identified brain features measured at intake (mean age = 2.5 years), when included in a predictive model along with clinical and demographic features, markedly improved the accuracy for classifying good vs. poor language outcome for toddlers with ASD at 3-4 years of age.
The identified regional alterations were largely consistent with previous findings. Studies have found that young children 2,14,15,17 , adolescents, and adults 55 with ASD show GMV enlargement in the temporal lobe, especially in the superior, middle temporal and fusiform gyri 56 . Increased cortical thickness in left hemisphere superior temporal cortex (LH STC) also appears to be a very strong and replicable finding in the literature, as evident in other large-scale studies in primarily adolescents and adults 57,58 . The current results show that increased LH STC thickness is present even earlier in ASD, in toddlerhood, and with larger effect sizes than in studies of older ASD individuals. This developmentally ubiquitous increase in cortical thickness of LH STC may yield insight into early developmental processes that contribute to cortical thickness (e.g., proliferation of excitatory neuronal cell types in different cortical layers).
Furthermore, normative brain charts indicate that cortical thickness tends to peak in early childhood followed by slow decline over the lifespan 59 , so these ASD toddler results combined with others in older ASD samples would indicate that increased early developmental cortical thickening combined with attenuated cortical thinning of LH STC may be a robust and key neural feature of ASD neurodevelopment. Given the observations of early developmental functional abnormalities in LH STC for language 33, 34, 60 , these converging results may indicate that atypical structural development and underlying genomic mechanisms affecting LH STC 34, 35 may perturb the ability of this region to develop functional specialization for processes like language and social-communication.
GMV reduction in the cerebellum has been well-documented for individuals with ASD spanning from childhood to late adulthood 25,26,27 . Postmortem studies also reveal that subjects with ASD have decreased number 61 and reduced size 62 of Purkinje cells in the cerebellar hemisphere and vermis. The identified volume increase in CC in ASD toddlers aligns with the finding that infants with ASD have significantly increased SA and thickness in CC starting at 6 months of age, and the increase is particularly robust in the anterior CC at both 6 and 12 months 22 . Other studies 21,22,23,24 suggest that CC in individuals with ASD likely undergoes overgrowth at early ages 22 , followed by abnormally slow or arrested growth, and later shows a reduction in adulthood 23,24 . Our results of SA reduction in the orbitofrontal cortex and posterior cingulate are consistent with a recent study led by Ecker 63 . Moreover, the identified alterations in thickness align with the finding by Zielinski et al. 64 that individuals with ASD have reduced thickness in the bilateral caudal middle frontal and the left pars opercularis during childhood and adolescence as well as in the right pars opercularis during adulthood.
By first factoring out brain size, we revealed differential abnormality in cortical patterning in multiple ASD-relevant language, social, face processing and behavior regulation regions. Abnormality was manifest in a complex map of differentially increased or decreased GM volume, surface area and thickness, and highlights the presence of dysregulated regional cortical growth. These early-age regional alterations of cortical attributes may be the result of progressive dysregulation in multiple neural processes and stages, consistent with prenatal multi-process, multi-stage models of ASD 65,66 . This advances our recent finding of atypical anterior-posterior and dorsal-ventral genetic cortical patterning in ASD toddlers with poor language and social outcomes 35 . In that study, atypical gene co-expression included genes involved in prenatal cortical patterning; all progenitor cell types involved in symmetrical and asymmetrical cell division that can alter surface area and cortical thickness; and excitatory neurons, oligodendrocyte precursors, endothelial cells, and microglia that may affect thickness. Thus, effects span multiple prenatal stages and growth processes, which we hypothesize lead to the multiple growth deviances in volume, surface area, and thickness across key cortical regions that we report here.
One mechanism that could be involved in these effects is the overactivity of a prenatal multi-pathway gene network, a gene dysregulation presented in ASD-derived prenatal progenitors and neurons and

We found that toddlers with ASD who had more aberrant brain measures also showed more severe symptoms and poorer cognitive performance. The identified brain-behavior associations largely aligned with previous findings. Recently, Grecucci and colleagues 31 reported that larger GMV in an autism-specific structural network (including fusiform and middle temporal gyri) was related to higher ADOS subscales (social affect and restricted and repetitive behavior) and total scores. Rojas et al. 25 also reported that GMV in the temporal region was positively associated with social and communication total score. A study led by Dziobek reported that increased cortical thickness in the fusiform gyrus was related to impairments in face processing in individuals with ASD 32 , consistent with our result that fusiform GMV was negatively related to the Mullen ratio VR score.
The identified brain regions were highly valuable for characterizing prognosis. The sMRI-clinical/demographic combined model achieved the highest accuracy for classifying ASD Good vs. ASD Poor, which parallels our functional imaging finding that a multimodal fMRI-clinical model outperformed single modality models 33 . Integrating multiple modalities can take full advantage of both modality-unique and complementary information from other modalities that is key for parsing ASD heterogeneity.
Notably, although the sMRI model had the highest accuracy (sensitivity) for detecting ASD Poor, its accuracy for ASD Good was low. There are two possible reasons: 1) our samples included more ASD Poor than ASD Good toddlers, resulting in better detection of ASD Poor characteristics than of ASD Good characteristics; and 2) the features input to the SVM were more pronounced in ASD Poor than ASD Good in general (see supplemental Tables S1-S3), although a few showed reversed patterns.
The findings presented in this study should be considered in context with its strengths and limitations.
Using brain regions showing significant ASD vs. TD differences as input for SVM reduced the likelihood of overfitting of the model. However, we may have missed other features that might also be important for discriminating ASD Good from ASD Poor. Future research should include a full exploration of all FreeSurfer features and training a more comprehensive model to improve the accuracy for detecting ASD Good. Another limitation is that while a majority of the identified brain alterations were replicated, further replication with larger samples is still necessary, especially for regions showing SA differences.
In summary, ASD toddlers showed GM alterations in regions mainly involved in language, social, face processing, and primary visual cortex. Most of the identified GM alterations were replicated in an independent cohort. Moreover, the identified GM alterations were associated with greater ASD symptom severity and cognitive impairments and showed great potential as prognostic biomarkers for language outcome prediction.

Methods
This study was approved by the Institutional Review Board at the University of California, San Diego (UCSD). Written informed consent was obtained from parents or legal guardians for all toddlers included in this study. Parents or legal guardians were compensated for their participation.

Main sample
All toddlers were recruited through community referrals or a general population-based screening method called Get SET Early 73 , previously known as the 1-Year Well-Baby Check-Up Approach 74, 75 , allowing detection of ASD at early ages (e.g., ~ 12 months). Toddlers were tracked from an intake assessment (1-3 years of age) and followed roughly every 12 months until 3 to 4 years of age (outcome visit). All toddlers participated in a series of clinical and behavioral assessments at each visit, including ADOS (Module T, 1, or 2) for ASD symptom evaluation 76, 77, 78 , the Mullen Scales of Early Learning 79 for evaluating early cognition, and the Vineland Adaptive Behavior Scales 80 for assessing a child's functional skills in four different developmental domains.
All assessments were performed by licensed psychologists with PhD degrees and occurred at the UCSD Autism Center of Excellence. Diagnosis at the most recent clinical visit was used in this study. Diagnosis of ASD was determined by highly experienced and licensed psychologists using diagnostic criteria in DSM 81 or 1 in combination with the gold-standard ADOS evaluation 82 . TD toddlers showed no history of any developmental delay. Since some toddlers with ASD scored at the floor of the standardized scores on Mullen subscales, we computed a ratio score for each subscale by dividing the age equivalent score by the toddler's chronological age 83,84,85,86 . We used these ratio scores to evaluate their associations with brain morphometry.
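As a worked illustration of the ratio score, the sketch below divides a subscale's age equivalent by chronological age. The function and argument names are hypothetical, both ages are assumed to be in months, and the text does not state whether the quotient was additionally rescaled (e.g., multiplied by 100), so the raw quotient is returned.

```python
def mullen_ratio_score(age_equivalent_months, chronological_age_months):
    """Ratio score = age equivalent / chronological age (same units assumed)."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return age_equivalent_months / chronological_age_months


# Hypothetical example: a 24-month-old whose receptive language
# age equivalent is 12 months has a ratio score of 0.5.
example_ratio = mullen_ratio_score(12, 24)
```

Unlike a floored T score, the ratio keeps distinguishing toddlers whose age equivalents differ even when their standardized scores are identical at the floor.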
Clinical and behavioral scores and sMRI scans were collected from 343 toddlers (198 ASD and 145 TD).  88 to provide global and regional brain morphometric measures, including total brain volume, total surface area, mean cortical thickness, cortical sub-regional volume/SA/thickness, and subcortical volumes. FreeSurfer aligns each toddler's brain to an average brain derived from cortical folding patterns through nonlinear surface-based registration 89 . This tool has been validated for studies of children 90 and has shown great success in large pediatric studies 35,91,92 .
Quality evaluation was further performed on the raw and segmented sMRI scans by two independent raters with a rating scale ranging from 0 to 3 (0 = best, 1 = great, 2 = usable, 3 = unusable). Out of 447 sMRI scans, 75 were rated as unusable and were excluded from the study, yielding 372 scans.
Replication sample
Seventy-six toddlers (38 ASD and 38 TD) recruited in our previous study 14 were used as a replication sample. Toddlers were recruited through clinical referral or advertisements and were diagnosed by the same licensed psychologist with the above-mentioned criteria. sMRI scans were collected at the same site with a 1.5T Siemens Symphony system during the toddler's natural sleep at night. A total of 170 sMRI scans were collected at intake and follow-up visits. All replication sMRI scans were preprocessed with FreeSurfer 5.3 using the same pipeline and the same Linux platform as used for the main sample. Similarly, sMRI scans with excessive motion or bad segmentation quality were excluded, yielding 167 scans from 75 unique toddlers (38 ASD, 37 TD; 55 male, 20 female) for testing the replicability of ASD vs. TD differences identified in the main sample. The detailed participant recruitment, diagnosis evaluation, and scan collection information can be found in 14 .
Brain structure difference between ASD and TD toddlers
For both the main and replication samples, ASD vs. TD differences in global and regional brain size were examined using the same LMEMs, described below. Brain global measurement (eTIV, total cortical SA, and mean cortical thickness) differences between ASD and TD were tested using the LMEM:
Brain global measure = β₀ + β₁ × diagnosis + β₂ × scan age + β₃ × sex + subject ID + ε,
where each global brain measure was treated as the dependent variable, and fixed-effect predictors included diagnosis, age at scan, and sex. Subject was treated as a random effect to take longitudinal scans into account. Diagnosis was coded as a dummy variable (ASD = 1, TD = 0). Thus, for each brain region tested, the beta value of diagnosis can be interpreted as how much larger/smaller (unit: cm for thickness, cm² for SA, cm³ for volume) ASD toddlers' brains are compared to TDs' brains. ASD vs. TD differences in cortical and subcortical volume, cortical regional surface area and thickness were tested using the LMEM below:
Regional volume/SA/thickness = β₀ + β₁ × diagnosis + β₂ × scan age + β₃ × sex + β₄ × brain global measure + subject ID + ε,
where volume/SA/thickness of each brain region was treated as the dependent variable. Subject was treated as a random effect, and other predictors (diagnosis, age at scan, sex, and brain global measure) were modeled as fixed effects. Brain global measures included eTIV for testing sub-cortical and cortical regional volume, total cortical SA for testing regional SA, and mean cortical thickness for testing regional thickness measures. To identify cortical regions with significant volume/SA/thickness differences between ASD and TD in the main sample, a false discovery rate (FDR) correction at p < 0.05 was applied to correct for 68 comparisons (68 LH and RH cortical regions). FDR correction at p < 0.05 was also applied to correct for comparisons of subcortical regions, cerebellum (LH and RH), and corpus callosum (CC) regions separately. The identified ASD vs. TD differences were considered replicated if the corresponding p values were less than 0.05 in the replication sample.
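The text does not name the FDR procedure used. As one common possibility, the sketch below implements the Benjamini-Hochberg step-up procedure in plain Python; this choice, the function name, and the toy p values are assumptions made for illustration, not details from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the test survives BH-FDR at `alpha`.

    Assumption: Benjamini-Hochberg is used; the source only says "FDR at p < 0.05".
    """
    m = len(p_values)
    # Indices of the p values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Step-up rule: reject every hypothesis ranked at or below max_rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Toy p values (not study results); in the main sample m would be 68
# for the cortical regions of one morphometric measure.
toy_rejections = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

In this toy case only the smallest p value survives, since 0.03 and 0.04 exceed their rank-scaled thresholds of 0.025 and 0.0375.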

Brain-behavior association analyses
To understand the behavioral significance of observed brain alterations, associations between behavioral measures (ADOS, Mullen, and Vineland) evaluated at the time of scan and brain regions showing significant ASD vs. TD differences were examined in ASD and TD toddlers separately using the linear regression model:
Behavioral measure = β₀ + β₁ × volume/SA/thickness of a brain region + β₂ × age + β₃ × sex + ε,
where each behavioral measure was treated as the response variable, and age, sex, and volume/SA/thickness of a brain region were predictors. FDR correction at p < 0.05 was applied to correct for multiple comparisons. Moreover, brain-by-diagnosis interaction effects in predicting behavioral measures were investigated using the regression model:
Behavioral measure = β₀ + β₁ × volume/SA/thickness of a brain region + β₂ × diagnosis + β₃ × volume/SA/thickness of a brain region × diagnosis + β₄ × age + β₅ × sex + ε.
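To see what the interaction coefficient measures, the model can be written out per group. The block below is a short derivation, assuming diagnosis D uses the same dummy coding as the LMEMs (ASD = 1, TD = 0); here x stands for the regional volume/SA/thickness and y for the behavioral measure, both symbols introduced only for this illustration.

```latex
\begin{aligned}
\text{TD } (D = 0):\quad & \mathbb{E}[y] = \beta_0 + \beta_1 x + \beta_4\,\text{age} + \beta_5\,\text{sex} \\
\text{ASD } (D = 1):\quad & \mathbb{E}[y] = (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\,x + \beta_4\,\text{age} + \beta_5\,\text{sex}
\end{aligned}
```

Under this coding, β₃ is the difference between the ASD and TD brain-behavior slopes, so a significant β₃ indicates that the association differs by diagnosis.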

Predicting language outcome for ASD toddlers
We employed SVM with ridge regularization to predict language outcome. SVM with ridge can select features of importance to achieve a stable classification result. We tested and evaluated three different models: clinical/demographic only, sMRI only, and clinical/demographic + sMRI models. The clinical/demographic only model used behavioral (ADOS, Mullen and Vineland) and demographic (sex, age at intake, and gap between intake and outcome visit) variables at intake visit. The sMRI only model leveraged age- and sex-adjusted intake FreeSurfer measures (age and sex effects were estimated using TD data 93 ) within regions that showed significant ASD vs. TD differences. The clinical/demographic + sMRI model used all intake features included in the clinical/demographic only and sMRI only models. Each variable/feature was scaled to be between 0 and 1 prior to SVM for all models. Each model was cross-validated with the training samples (80% of samples) using 5-fold cross-validation, and its performance was evaluated with a hold-out testing set (20% of samples). Accuracy, sensitivity, specificity, and AUC were computed to reflect the performance of the prediction models.

contributed to funding acquisition. All authors contributed to interpreting the results and discussion.

Competing interests
The authors report no competing interests.

Figure 1
Brain regions showing significant differences between ASD and TD toddlers in the main sample in terms of cortical volume, non-cortical volume, cortical thickness and cortical SA. Colors represent corresponding effect sizes (Cohen's d), where regions with hot colors showed significant increases in size among ASD compared to TD and regions with cold colors showed significant decreases in size among ASD compared to TD; the darker the color, the larger the difference between ASD and TD.

Figure 2
Brain regions replicated for ASD vs. TD differences in cortical volume, non-cortical volume and cortical thickness. Colors represent corresponding effect sizes (Cohen's d), where regions with hot colors showed significant increases in size among ASD compared to TD and regions with cold colors showed significant decreases in size among ASD compared to TD; the darker the color, the larger the difference between ASD and TD.

Figure 3